On-line Dialogue Policy Learning with Companion Teaching

Authors

Lu Chen

Runzhe Yang

Cheng Chang

Zihao Ye

Xiang Zhou

Kai Yu

Abstract

On-line dialogue policy learning is key to building evolvable conversational agents in real-world scenarios. A poor initial policy can easily lead to a bad user experience and consequently fail to attract enough real users for policy training. We propose a novel framework, companion teaching, which includes a human teacher in the on-line dialogue policy training loop to address the cold-start problem. Here, the dialogue policy is trained using not only the user’s reward but also the teacher’s example actions and an estimated immediate reward at the turn level. Simulation experiments showed that, with a small number of human teaching dialogues, the proposed approach can effectively improve user experience at the beginning and lead smoothly to good performance as more user interaction data become available.

Related resources

Affordable On-line Dialogue Policy Learning

The key to building an evolvable dialogue system in real-world scenarios is to ensure affordable on-line dialogue policy learning, which requires the on-line learning process to be safe, efficient and economical. In reality, however, the dialogue system usually grows slowly because real interaction data are scarce. Besides, a poor initial dialogue policy easily leads to a bad user experience...

On-Line Learning of a Persian Spoken Dialogue System Using Real Training Data

The first spoken dialogue system developed for the Persian language is introduced. This is a ticket reservation system with Persian ASR and NLU modules. The focus of the paper is on learning the dialogue management module. In this work, real on-line training data are used during the learning process. For on-line learning, the effect of the variations of discount factor (g) on the learning speed...

Agent-Aware Dropout DQN for Safe and Efficient On-line Dialogue Policy Learning

Hand-crafted rules and reinforcement learning (RL) are two popular choices for obtaining a dialogue policy. A rule-based policy is often reliable within a predefined scope but not self-adaptable, whereas RL is evolvable with data but often suffers from poor initial performance. We employ a companion learning framework to integrate the two approaches for on-line dialogue policy learning, in which a p...

Reward Estimation for Dialogue Policy Optimisation

Viewing dialogue management as a reinforcement learning task enables a system to learn to act optimally by maximising a reward function. This reward function is designed to induce the system behaviour required for the target application and for goal-oriented applications, this usually means fulfilling the user’s goal as efficiently as possible. However, in real-world spoken dialogue system appl...

Publication date: 2017
